Determine the intrinsic parameters of 10 cameras using a dataset of checkerboard images.
The provided dataset contains images of a checkerboard pattern captured from 10 different cameras. These images are used to calibrate each camera and recover its intrinsic parameters.
Dataset: https://drive.google.com/drive/folders/1g7M-IgGUQzDXCZ7XUEjWlApbBjPwXBq9?usp=drive_link
The script below performs camera calibration for multiple cameras and stores the calibration parameters in a dictionary keyed by camera id.
import numpy as np
import cv2
import glob
import matplotlib.pyplot as plt
from tqdm import tqdm
import re

# For counting frames with a correctly identified chessboard pattern (frames used for calibration)
count = 1

# Extracts the camera id (trailing digits) from the image name
def get_cam_id(img_name):
    return re.search(r"(\d+)$", img_name).group()

# Path to the folder containing the images
image_folder = 'camera data\\'
pattern_wise_object_points = {}
# Chessboard square size (in mm) can be used if the chessboard square size is known
# square_size = 25
# Image format
image_format = '.png'

# Initializing combinations of pattern sizes for finding the optimal chessboard size.
# Patterns are in descending order so that bigger patterns are searched first, to avoid
# detecting intermediate smaller patterns (e.g. a 3x3 pattern within an 8x8 chessboard).
for i in range(8, 3, -1):
    for j in range(8, 3, -1):
        pattern_size = (i, j)
        # Prepare object points: (0,0,0), (1,0,0), (2,0,0), ...
        objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
        objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
        # objp *= square_size
        # Storing pattern-size-wise object points
        pattern_wise_object_points[pattern_size] = objp

# Get list of images in the folder
images = glob.glob(image_folder + '*' + image_format)
# Initializing the dictionary for storing the calibration parameters per camera
camera_dict = {i: {"object_points": [], "image_points": []}
               for i in set(get_cam_id(img.split('.')[0]) for img in images)}

# Loop through each image in the folder; tqdm creates a progress bar
for fname in tqdm(images):
    # Extracting the camera id from the image name
    cam_id = get_cam_id(fname.split('.')[0])
    # Read the image and convert it to grayscale
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Searching for a chessboard pattern in the current frame
    for pattern_size, objp in pattern_wise_object_points.items():
        # Find the chessboard corners
        ret, corners = cv2.findChessboardCornersSB(gray, pattern_size, None)
        # If corners are found, add object points and image points against camera_id
        if ret:
            print('corners found!', pattern_size)
            # Updating the object and image points in the camera dictionary
            camera_dict[cam_id]["object_points"].append(objp)
            camera_dict[cam_id]["image_points"].append(corners)
            # Visualize and show the detected corners
            img_corners = cv2.drawChessboardCorners(img, pattern_size, corners, ret)
            plt.imshow(cv2.cvtColor(img_corners, cv2.COLOR_BGR2RGB))
            plt.title(f'Chessboard Corners image no. {count}')
            count += 1
            plt.show()
            # Breaking since the chessboard is detected
            break
[output condensed] Chessboard corners were found in 62 of the 97 frames, mostly as a (7, 7) pattern, occasionally as (7, 5), (8, 4) or (7, 4).
100%|██████████| 97/97 [03:01<00:00, 1.87s/it]
# Calibrating all cameras
for cam_id, v in camera_dict.items():
    print('camera id', cam_id)
    # Calibrate camera (gray.shape comes from the last image read above,
    # so this assumes all cameras share the same resolution)
    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
        v['object_points'], v['image_points'], gray.shape[::-1], None, None)
    # Storing calibrated camera parameters in the dictionary against camera_id
    camera_dict[cam_id]['camera_matrix'] = mtx
    camera_dict[cam_id]['distortion_coefficients'] = dist
    # Print the intrinsic parameters of the camera
    print("Intrinsic Parameters (Camera Matrix):")
    print(mtx)
    print("\nDistortion Coefficients:")
    print(dist)
camera id 8
Intrinsic Parameters (Camera Matrix):
[[2.01226284e+03 0.00000000e+00 8.01532842e+02]
 [0.00000000e+00 1.95211779e+03 3.75133979e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Distortion Coefficients:
[[ 1.47409711e+00 -2.57604182e+01  5.79125640e-03 -1.13321127e-01  2.38245407e+02]]

camera id 10
Intrinsic Parameters (Camera Matrix):
[[1.82508231e+04 0.00000000e+00 6.39578331e+02]
 [0.00000000e+00 1.82759528e+04 3.58947813e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Distortion Coefficients:
[[ 2.12263462e-01 -1.12365892e+03  6.60530850e-02 -8.24250361e-03 -1.52105505e+00]]

camera id 9
Intrinsic Parameters (Camera Matrix):
[[1.31261953e+04 0.00000000e+00 6.37653704e+02]
 [0.00000000e+00 1.29957836e+04 3.59145520e+02]
 [0.00000000e+00 0.00000000e+00 1.00000000e+00]]
Distortion Coefficients:
[[-4.70966847e+00  2.55999626e+03  3.41920036e-02  1.33462427e-02  5.89385124e+00]]
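A useful sanity check on each calibration is the mean reprojection error: re-project the board corners through the recovered model and compare with the detected corners. The sketch below is self-contained and uses made-up intrinsics, a made-up view pose, and synthetic noisy "detections"; in the notebook, `cv2.projectPoints` with the stored points and the per-view `rvecs`/`tvecs` returned by `cv2.calibrateCamera` would replace the hand-rolled pinhole projection.

```python
import numpy as np

# Hypothetical intrinsics and pose (illustration only; in the notebook these
# would be the calibrated mtx and the per-view rvecs/tvecs per camera).
K = np.array([[2000.0, 0.0, 640.0],
              [0.0, 2000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                      # made-up view rotation
t = np.array([0.5, -0.3, 10.0])    # made-up view translation

def project(K, R, t, pts):
    """Pinhole projection of Nx3 world points to Nx2 pixel coords."""
    cam = pts @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]

# Planar 7x7 board, as in the calibration script above.
objp = np.zeros((49, 3))
objp[:, :2] = np.mgrid[0:7, 0:7].T.reshape(-1, 2)

# Synthetic "detections": projected corners plus ~0.1 px detector noise.
detected = project(K, R, t, objp) + np.random.default_rng(0).normal(0, 0.1, (49, 2))

# RMS reprojection error; well-calibrated cameras typically land well under 1 px.
err = np.sqrt(np.mean(np.sum((detected - project(K, R, t, objp)) ** 2, axis=1)))
print(f"RMS reprojection error: {err:.3f} px")
```

A high error for one camera (note the very large focal lengths reported for cameras 9 and 10 above) is a hint that its views were degenerate or its corner detections poor.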
Given images from 10 fixed cameras of a pitch with known dimensions, estimate the 3D coordinates of a moving ball and the pose of each camera, and predict the 3D trajectory of the ball for applications like LBW decisions in cricket.
Share tentative algorithms for doing this. For the 3D estimation, don't use deep-learning-based techniques. Visual diagrams and code snippets are always welcome.
Bonus points for giving methods to optimize the measurements after all calculations.
Since the cameras are calibrated, and assuming they are synchronised so that the frames captured by all cameras show the ball at the same time instant:
First, we need to detect the ball in each camera frame. A detection model can be used for this; it returns the class and the 2D image coordinates of the ball.
The image coordinates give the location of the ball in 2D space with respect to the current frame. On their own they are of no use for 3D estimation, as image coordinates cannot be used "as is" as real-world 3D coordinates.
To translate 2D image coordinates into 3D real-world coordinates, the multi-camera setup can be used. For that, a world origin can be fixed, e.g. at the centre of the pitch, with the X, Y and Z axes defined accordingly; each camera's pose (rotation and translation) with respect to this origin can then be estimated from pitch landmarks with known positions.
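One classical (non-deep-learning) way to estimate a camera's pose from known pitch landmarks is DLT resection: recover the 3x4 projection matrix P ~ K[R|t] from 2D-3D correspondences. In practice `cv2.solvePnP` or `cv2.solvePnPRansac` would be used on real detections; the sketch below is pure NumPy, and all landmark coordinates, intrinsics and the camera pose are made up for illustration.

```python
import numpy as np

def dlt_projection_matrix(world, pixels):
    """Direct Linear Transform: recover the 3x4 projection matrix P
    (P ~ K[R|t]) from >= 6 non-coplanar world<->pixel correspondences."""
    A = []
    for (X, Y, Z), (u, v) in zip(world, pixels):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    _, _, Vt = np.linalg.svd(np.array(A))
    return Vt[-1].reshape(3, 4)

# Hypothetical landmarks in a pitch-centred frame (metres, made up):
# crease corners on the ground plane plus two stump tops off the plane.
world = np.array([
    [-10.06, -1.32, 0.0], [-10.06, 1.32, 0.0],
    [ 10.06, -1.32, 0.0], [ 10.06, 1.32, 0.0],
    [-10.06,  0.0, 0.71], [ 10.06, 0.0, 0.71],
])

# Simulate the 2D detections with a known camera, then recover it.
K = np.array([[2000.0, 0.0, 640.0], [0.0, 2000.0, 360.0], [0.0, 0.0, 1.0]])
R = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, -1.0], [-1.0, 0.0, 0.0]])  # side-on view
t = np.array([0.0, 1.0, 25.0])
P_true = K @ np.hstack([R, t[:, None]])
h = np.hstack([world, np.ones((6, 1))])
proj = h @ P_true.T
pixels = proj[:, :2] / proj[:, 2:3]

P = dlt_projection_matrix(world, pixels)
P /= P[2, 3]                     # fix the arbitrary DLT scale
print(np.allclose(P, P_true / P_true[2, 3], atol=1e-6))
```

K, R and t can then be separated from P (e.g. via RQ decomposition) if the explicit pose is needed; since the intrinsics are already calibrated, PnP on the known K is the more robust route.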
A single camera gives only 2D information about the ball; if more than one camera is used, the 3D location of the ball can be calculated using "triangulation".
To put it in simple words: consider the pitch as a cuboid of length x width x height, and two cameras, one facing the batsman straight on from the non-striker's end, and another viewing from the side such that the bowler is on the left-hand side and the batsman on the right-hand side of the frame. Each camera defines a ray from its centre through the ball's image point, and the ball's 3D position is where the two rays (approximately) intersect.
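The two-camera triangulation above can be sketched with a linear (DLT) solve; `cv2.triangulatePoints` does the same thing. The projection matrices below are hypothetical stand-ins for the K[R|t] of two calibrated, posed cameras, and the ball position is made up:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen by two cameras.
    P1, P2: 3x4 projection matrices; x1, x2: (u, v) pixel coords."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]          # dehomogenise

# Hypothetical setup: camera 1 at the non-striker's end, camera 2 side-on.
K = np.array([[2000.0, 0.0, 640.0],
              [0.0, 2000.0, 360.0],
              [0.0, 0.0, 1.0]])
Rt1 = np.hstack([np.eye(3), np.array([[0.0], [0.0], [30.0]])])
R2 = np.array([[0.0, 0.0, -1.0],     # rotated 90 deg, viewing from the side
               [0.0, 1.0,  0.0],
               [1.0, 0.0,  0.0]])
Rt2 = np.hstack([R2, np.array([[0.0], [0.0], [30.0]])])
P1, P2 = K @ Rt1, K @ Rt2

# Project a known ball position into both views, then recover it.
ball = np.array([1.0, -0.5, 2.0, 1.0])
x1 = (P1 @ ball)[:2] / (P1 @ ball)[2]
x2 = (P2 @ ball)[:2] / (P2 @ ball)[2]
X_hat = triangulate(P1, P2, x1, x2)
print(X_hat)   # ≈ [1.0, -0.5, 2.0]
```

With all 10 cameras, every pair (or the full over-determined system) contributes equations, and repeating this per synchronised frame yields the 3D trajectory to fit and extrapolate for the LBW prediction.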
Objective: Detect a small object (the ball) in the frames captured by the camera system. Assignment Details:
1. Detection Algorithm/Architecture:
Propose an algorithm/architecture suitable for detecting small objects like a cricket ball in the provided images. Justify your choice.
2. Submission Requirements:
Modifying existing models is encouraged; if you build the model from scratch, provide a rationale.
Object detection is a pretty straightforward process using standard model architectures. However, detecting small objects like a cricket ball is harder, because a small object can lose its features when the image is resized for model inference. Small-object detection is a difficult task, but it is still possible to build a system that detects the ball with high accuracy. The final model can be optimised using:
software-based optimisations
hardware-based optimisations
a combination of software- and hardware-based optimisations
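One common software-based optimisation for small objects is tiled (sliced) inference: instead of resizing the full frame down to the model's input size, run the detector on overlapping crops at native resolution and merge the results. A minimal sketch of the tiling step (the 640 px tile size, 20% overlap, and frame size are assumed values; the detector itself and the NMS merge are omitted):

```python
def make_tiles(h, w, tile=640, overlap=0.2):
    """Top-left (y, x) corners of overlapping tiles covering an h x w frame,
    so the small ball is seen at native resolution instead of being shrunk
    away by a global resize. Assumes the frame is at least tile x tile."""
    step = int(tile * (1 - overlap))
    ys = list(range(0, max(h - tile, 0) + 1, step))
    xs = list(range(0, max(w - tile, 0) + 1, step))
    # Make sure the last row/column of tiles reaches the frame border.
    if ys[-1] + tile < h:
        ys.append(h - tile)
    if xs[-1] + tile < w:
        xs.append(w - tile)
    return [(y, x) for y in ys for x in xs]

tiles = make_tiles(1080, 1920)
print(len(tiles), tiles[:3])
```

Each crop `frame[y:y+tile, x:x+tile]` is passed to the detector, box coordinates are offset back by `(x, y)` into frame coordinates, and overlapping detections are merged with non-maximum suppression. Hardware-based optimisations (higher-resolution sensors, longer lenses, better placement) attack the same problem from the capture side.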
Determine the ideal FPS (Frames Per Second) and shutter speed for capturing the ball's motion clearly, and specify the precision required in camera synchronization for effective multi-camera analysis.
The fastest ball bowled so far is 161.3 km/h by Shoaib Akhtar, i.e. ~44.8 m/s. This means the ball travels ~45 metres in one second, so if we captured it with a 1-second exposure we would see the ball as a ~45 m long trail due to motion blur. The sharpest picture would require a stationary ball, which is obviously not the case here, so the shutter speed must be short enough to effectively freeze the motion.
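The 44.8 m/s figure can be turned into concrete camera settings. The blur budget below (blur no larger than a quarter of the ball's ~7.2 cm diameter) is an assumed tolerance, not a standard, but the arithmetic pattern is what matters:

```python
ball_speed = 161.3 / 3.6          # fastest recorded delivery, in m/s (~44.8)
ball_diameter = 0.072             # cricket ball diameter in m (approx.)

# Shutter speed: keep motion blur below a fraction of the ball's diameter
# (the 1/4-diameter budget here is an assumed tolerance).
blur_budget = ball_diameter / 4
max_exposure = blur_budget / ball_speed
print(f"max exposure ~ 1/{1 / max_exposure:.0f} s")

# FPS determines trajectory sampling: distance the ball moves between frames.
for fps in (120, 240, 300):
    print(f"{fps} fps -> {ball_speed / fps * 100:.1f} cm between frames")

# Sync precision: a 1 ms offset between two cameras shifts the ball by
print(f"1 ms desync -> {ball_speed * 1e-3 * 100:.1f} cm of position error")
```

So the shutter must be on the order of 1/2000 s or faster, the frame rate is chosen by how finely the trajectory must be sampled (especially around the bounce), and cross-camera synchronisation needs to be at sub-millisecond precision if the triangulated position is to stay within a few centimetres.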